Description

The data were collected during a study of the settlement pattern of common terns on a small islet in the Delta d’Ebre (Hernandez and Ruiz, range3), particularly in the mouths of the Ebre river. The islet was inspected at two-day intervals throughout the range0 breeding season. The data include the location of each nest, its elevation above sea level, and elevations at a number of additional points (points without nest) on the islet. In the file called elevationsIslet.txt, contains the information of the coordinates and elevation above sea, and in file, called poly84.txt contains the coordinates of the borders of the islet. The aim is to predict the elevation above sea level along the small islet using a kriging interpolation.

1) Explore the requirement of stationary mean of the process. In case this requirement is not met, detrend the data to ensure that the process is stationary in mean. Discuss the results and show the plot of the results

Firstly, we plot the locations to see the spatial distribution of the data

We also look at the distribution in relation to the x-coordinates (E-W) and y-coordinates(N-S).

The plots show a concentration of high values in the extreme east and west of the islet and we can observe a grater density in the north.

The process doesn’t seem stationary, indeed there is a clear quadratic trend along the x direction. In addition we try using the y direction as regressor, to find a possible linear or quadratic trend.

## 
## Call:
## lm(formula = data ~ x + y + I(x^2) + I(y^2), data = dataset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -22.639  -6.417  -0.964   6.024  22.744 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 17.4502678  2.8429177   6.138 7.78e-09 ***
## x           -0.4938096  0.1103379  -4.475 1.55e-05 ***
## y            0.0349038  0.0570014   0.612    0.541    
## I(x^2)       0.0040706  0.0009267   4.393 2.17e-05 ***
## I(y^2)       0.0019508  0.0008841   2.206    0.029 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.19 on 143 degrees of freedom
## Multiple R-squared:  0.2026, Adjusted R-squared:  0.1803 
## F-statistic: 9.082 on 4 and 143 DF,  p-value: 1.446e-06

The linear term in y doesn’t seem significant, so we remove it.

Now we have obtained a model in which all regressors seem to be significant. We save the residuals of our linear model and look at the de-trended data:

What we now obtain is a more homogeneous distribution of the data around the value 0.

We don’t notice any particular trend so this new dataset seems stationary!

2) Explore the spatial dependence of the elevation variable using the variogram cloud and bins and the empirical variogram. Discuss the results and plot them

We try to analyse the spatial dependence through a Variogram Cloud, using the Robust Estimators (since the classical one might consider outliers those values which are not necessarily outliers)

## variog: computing omnidirectional variogram

The variability at small distances appears a bit greater than that for larger distances. => We reduce the density of the plot by reducing the maximum distance over which the variance are calculate.

## variog: computing omnidirectional variogram

## variog: computing omnidirectional variogram

##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

=> All the bins have at least pairs.min=30 observations each, indeed they have:

##  [1] 422 650 731 755 660 514 460 473 497 447 457 415 377

Empirical Variogram

## variog: computing omnidirectional variogram

In the variogram of the residuals the values increases until certain distance, and then they keep constant. That’s ok since it’s the behaviour that we expect from a stationary variogram.

3) Check the hypothesis of the spatial independence

To check the hypothesis of the spatial independence we use a Monte Carlo approach

## variog.env: generating 3000 simulations by permutating data values
## variog.env: computing the empirical variogram for the 3000 simulations
## variog.env: computing the envelops

All the values of the empirical variogram are inside the envelope, therefore the process has no spatial dependence.

4) Check the isotropy property of the process. Comment the results, it’s not necessary to overcome the anisotropy.

To check the anisotropy we need to compute the directional variogram in the 4 main directions: 0º, 45º,90º and 135º.

## variog: computing variogram for direction = 0 degrees (0 radians)
##         tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 45 degrees (0.785 radians)
##         tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 90 degrees (1.571 radians)
##         tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 135 degrees (2.356 radians)
##         tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing omnidirectional variogram

The directional variograms don’t seem to be perfectly overlapping: they might have the a different sill, in particular the variogram seems to have a higher value along the 90° direction => Geometrical anistropy

Let’s also analyse the range observing the Rose Diagram:

## variog: computing variogram for direction = 45 degrees (0.785 radians)
##         tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 90 degrees (1.571 radians)
##         tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 135 degrees (2.356 radians)
##         tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 180 degrees (3.142 radians)
##         tolerance angle = 45 degrees (0.785 radians)

We can observe that the Rose diagram is more or less a circle. At different directions, the variance is reached more or less at the same range. Then, this process is isotropic.

OPPURE

We notice that the Rose diagram is not perfectly circular: it is slightly elliptical, with major range in the 45 direction => Zonal anistropy.

=> In conclusion we have a combined anisotropy, so we can’t make the assumption of isotropic process, which is necessary for the correct use of the kriging techniques, since the theoretical variograms used for krigring are based on isotropic models.

5) Propose four theoretical variograms and estimate the parameters via restricted maximum likelihood or weighed least square. Select the two variograms which best fit the data. Explain the parameters of the chosen variogram (sill, nugget, range and kappa).

## variog: computing omnidirectional variogram

We’ll use a Maximum Likelihood approach

Exponential

## likfit: estimated model parameters:
##     beta    tausq  sigmasq      phi 
## "-2.631" "34.598" "71.657" " 6.083" 
## Practical Range with cor=0.05 for asymptotic range: 18.2225
## 
## likfit: maximised log-likelihood = -531.4

Gaussian

## likfit: estimated model parameters:
##     beta    tausq  sigmasq      phi 
## "-2.786" "41.157" "66.712" " 5.601" 
## Practical Range with cor=0.05 for asymptotic range: 9.693811
## 
## likfit: maximised log-likelihood = -529.8

Spherical

## likfit: estimated model parameters:
##     beta    tausq  sigmasq      phi 
## "-2.758" "36.363" "69.395" "12.992" 
## Practical Range with cor=0.05 for asymptotic range: 12.99251
## 
## likfit: maximised log-likelihood = -530.3

Matern

## likfit: estimated model parameters:
##      beta     tausq   sigmasq       phi     kappa 
## "-2.7801" "40.9846" "66.7781" " 0.5636" "25.1228" 
## Practical Range with cor=0.05 for asymptotic range: 9.875097
## 
## likfit: maximised log-likelihood = -529.8

All the simulated variograms contain the empirical variogram so we don’t have evidence to exclude any model.

Let’s compare the different models

The two fitted models with the highest loglikelihood are the Matern model and the Gaussian model. So we’ll use these two to perform the kriging prediction step.

6) Predict the elevations along all the area of study using the two variogram selected in point 4. Discuss the type of kriging chosen:

a. Compare both kriging predictions using cross-validation, and propose the best model.

b. Show the predictions and their standard errors.

First werate a grid where we’ll perform our kriging predicitons

Gaussian

Let’s try using the gaussian model to do our prediction step:

## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood

## xvalid: number of data locations       = 148
## xvalid: number of validation locations = 148
## xvalid: performing cross-validation at location ... 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 
## xvalid: end of cross-validation
## [1] "data"      "predicted" "krige.var" "error"     "std.error" "prob"

Matern

## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood

## xvalid: number of data locations       = 148
## xvalid: number of validation locations = 148
## xvalid: performing cross-validation at location ... 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 
## xvalid: end of cross-validation